AITopics | frankle and carbin

A Recovery Guarantee for Sparse Neural Networks

Fridovich-Keil, Sara, Pilanci, Mert

arXiv.org Machine Learning, Sep-25-2025

We prove the first guarantees of sparse recovery for ReLU neural networks, where the sparse network weights constitute the signal to be recovered. Specifically, we study structural properties of the sparse network weights for two-layer, scalar-output networks under which a simple iterative hard thresholding algorithm recovers these weights exactly, using memory that grows linearly in the number of nonzero weights. We validate this theoretical result with simple experiments on recovery of sparse planted MLPs, MNIST classification, and implicit neural representations. Experimentally, we find performance that is competitive with, and often exceeds, a high-performing but memory-inefficient baseline based on iterative magnitude pruning.
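As a rough illustration of the thresholding idea named in the abstract above, the following is a minimal sketch of classical iterative hard thresholding for a sparse linear model y = A x. It is not the paper's algorithm for ReLU networks; it only shows the gradient step followed by keeping the k largest-magnitude entries. The toy problem, the helper names (`hard_threshold`, `matvec`, `rmatvec`, `sq_residual`, `iht`), and the conservative step size are all assumptions made for this example.

```python
import random


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the rest to zero."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


def matvec(A, x):
    """Return A x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Return A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def sq_residual(A, x, y):
    """Return the squared residual ||y - A x||^2."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, matvec(A, x)))


def iht(A, y, k, iters=300):
    """Iterative hard thresholding for y = A x, with x assumed k-sparse."""
    # Conservative step (an assumption of this sketch): ||A||_F^2 upper-bounds
    # the squared spectral norm, so the squared residual never increases.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * len(A[0])
    for _ in range(iters):
        residual = [yi - pi for yi, pi in zip(y, matvec(A, x))]
        direction = rmatvec(A, residual)  # negative gradient of 0.5 * ||y - A x||^2
        x = hard_threshold([xi + step * d for xi, d in zip(x, direction)], k)
    return x


# Hypothetical toy problem: 20 random measurements of a 3-sparse vector in R^40.
rng = random.Random(0)
m, n, k = 20, 40, 3
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
for i in rng.sample(range(n), k):
    x_true[i] = 1.0
y = matvec(A, x_true)
x_hat = iht(A, y, k)
print("nonzeros kept:", sum(1 for v in x_hat if v != 0.0))
print("squared residual:", sq_residual(A, x_hat, y))
```

The sketch stores the iterate as a dense list for readability. Each iterate has at most k nonzero entries, but the memory-efficient bookkeeping described in the abstract is not reproduced here.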

assumption 2, mlp, probability, (17 more...)

arXiv.org Machine Learning

2509.20323

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

8b2fc235787852ead92da2268cd9e90c-Paper-Conference.pdf

Neural Information Processing Systems, Aug-16-2025, 19:40:45 GMT

artificial intelligence, machine learning, neuron, (15 more...)

Neural Information Processing Systems

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

BINGO: A Novel Pruning Mechanism to Reduce the Size of Neural Networks

Panangat, Aditya

arXiv.org Artificial Intelligence, May-19-2025

Over the past decade, the use of machine learning has increased exponentially. Models are far more complex than ever before, growing to gargantuan sizes and housing millions of weights. Unfortunately, the fact that large models have become the state of the art means that it often costs millions of dollars to train and operate them. These expenses not only hurt companies but also bar non-wealthy individuals from contributing to new developments and force consumers to pay higher prices for AI. Current methods used to prune models, such as iterative magnitude pruning, have shown great accuracy but require an iterative training sequence that is incredibly computationally and environmentally taxing. To solve this problem, BINGO is introduced. BINGO, during the training pass, studies specific subsets of a neural network one at a time to gauge how significant a role each weight plays in contributing to a network's accuracy. By the time training is done, BINGO generates a significance score for each weight, allowing insignificant weights to be pruned in one shot. BINGO provides an accuracy-preserving pruning technique that is less computationally intensive than current methods, allowing for a world where AI growth does not also have to mean model growth.

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.09864

Genre:

Research Report (0.53)
Contests & Prizes (0.53)

Industry: Leisure & Entertainment (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

To update or not to update? Neurons at equilibrium in deep models

Bragagnolo, Andrea, Tartaglione, Enzo, Grangetto, Marco

arXiv.org Artificial Intelligence, Nov-14-2022

Recent advances in deep learning optimization showed that, with some a-posteriori information on fully-trained models, it is possible to match the same performance by simply training a subset of their parameters. Such a discovery has a broad impact from theory to applications, driving the research towards methods to identify the minimum subset of parameters to train without look-ahead information exploitation. However, the methods proposed do not match the state-of-the-art performance, and rely on unstructured sparsely connected models. In this work we shift our focus from the single parameters to the behavior of the whole neuron, exploiting the concept of neuronal equilibrium (NEq). When a neuron is in a configuration at equilibrium (meaning that it has learned a specific input-output relationship), we can halt its update; on the contrary, when a neuron is at non-equilibrium, we let its state evolve towards an equilibrium state, updating its parameters. The proposed approach has been tested on different state-of-the-art learning strategies and tasks, validating NEq and observing that the neuronal equilibrium depends on the specific learning setup.

artificial intelligence, machine learning, neuron, (15 more...)

arXiv.org Artificial Intelligence

2207.09455

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Emerging Paradigms of Neural Network Pruning

Wang, Huan, Qin, Can, Zhang, Yulun, Fu, Yun

arXiv.org Artificial Intelligence, Mar-11-2021

Over-parameterization of neural networks benefits optimization and generalization but brings cost in practice. Pruning is adopted as a post-processing solution to this problem; it aims to remove unnecessary parameters from a neural network with little loss in performance. It has been broadly believed that the resulting sparse neural network cannot be trained from scratch to comparable accuracy. However, several recent works (e.g., [Frankle and Carbin, 2019a]) challenge this belief by discovering random sparse networks that can be trained to match the performance of their dense counterparts. This new pruning paradigm has since inspired more methods for pruning at initialization. Despite the encouraging progress, how to coordinate these new pruning fashions with traditional pruning has not yet been explored. This survey seeks to bridge the gap by proposing a general pruning framework so that the emerging pruning paradigms can be accommodated alongside the traditional one. With it, we systematically reflect on the major differences and new insights brought by these new pruning fashions, with representative works discussed at length. Finally, we summarize the open questions as worthy future directions.

frankle and carbin, neural network, pruning, (14 more...)

arXiv.org Artificial Intelligence

2103.0646

Country: North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre:

Research Report (0.82)
Overview (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Lottery Tickets in Linear Models: An Analysis of Iterative Magnitude Pruning

Elesedy, Bryn, Kanade, Varun, Teh, Yee Whye

arXiv.org Machine Learning, Aug-6-2020

The lottery ticket hypothesis [Frankle and Carbin, 2019] asserts that a randomly initialised, densely connected feed-forward neural network contains a sparse sub-network that, when trained in isolation, attains equal or higher accuracy than the full network. The method used to find these sub-networks is iterative magnitude pruning (IMP). A network is given a random initialisation, trained by some form of gradient descent for a specified number of iterations and a proportion of its smallest weights (by absolute magnitude) are deleted. The remaining weights are then reset to their initialised values and the network is retrained. This procedure can be performed multiple times, resulting in a sequence of sparse yet trainable sub-networks.
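The iterative magnitude pruning loop described in the abstract above (train, delete a proportion of the smallest surviving weights, reset the survivors to their initial values, retrain) can be sketched on a small linear regression problem. This is an illustrative sketch only, not the authors' code: the function names (`train`, `prune_smallest`, `imp`), the synthetic data, the learning rate, and the 50% pruning fraction are all assumptions chosen for the example.

```python
import random


def train(w_init, mask, X, y, lr=0.05, steps=200):
    """Masked gradient descent on mean squared error, starting from w_init."""
    w = [wi * mi for wi, mi in zip(w_init, mask)]
    n = len(X)
    for _ in range(steps):
        preds = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
        grad = [0.0] * len(w)
        for row, p, t in zip(X, preds, y):
            for j, xj in enumerate(row):
                grad[j] += 2.0 * (p - t) * xj / n
        # Pruned weights stay at zero because the mask is reapplied every step.
        w = [(wi - lr * g) * mi for wi, g, mi in zip(w, grad, mask)]
    return w


def prune_smallest(w, mask, fraction):
    """Remove the given fraction of surviving weights with the smallest |w|."""
    alive = [j for j, mj in enumerate(mask) if mj]
    n_prune = int(fraction * len(alive))
    drop = set(sorted(alive, key=lambda j: abs(w[j]))[:n_prune])
    return [0 if j in drop else mj for j, mj in enumerate(mask)]


def imp(w_init, X, y, fraction=0.5, rounds=3):
    """Return the sequence of masks produced by iterative magnitude pruning."""
    mask = [1] * len(w_init)
    masks = [mask]
    for _ in range(rounds):
        # Each round restarts from w_init, i.e. survivors are reset to their
        # initialised values before retraining.
        w_trained = train(w_init, mask, X, y)
        mask = prune_smallest(w_trained, mask, fraction)
        masks.append(mask)
    return masks


# Hypothetical data: 50 samples, 16 features, only the first two are relevant.
rng = random.Random(0)
d, n = 16, 50
X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
w_star = [1.0 if j < 2 else 0.0 for j in range(d)]
y = [sum(ws * xi for ws, xi in zip(w_star, row)) for row in X]
w_init = [rng.gauss(0.0, 0.1) for _ in range(d)]

masks = imp(w_init, X, y)
ticket = train(w_init, masks[-1], X, y)  # retrain the final sparse sub-network
print("surviving weights per round:", [sum(m) for m in masks])
```

Each mask in the returned sequence, applied to `w_init`, defines one of the sparse sub-networks; the masks are nested because a pruned weight is never restored.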

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Machine Learning

2007.08243

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Gambling (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Sparse Transfer Learning via Winning Lottery Tickets

Mehta, Rahul

arXiv.org Machine Learning, May-19-2019

The recently proposed Lottery Ticket Hypothesis of Frankle and Carbin (2019) suggests that the performance of over-parameterized deep networks is due to the random initialization seeding the network with a small fraction of favorable weights. These weights retain their dominant status throughout training -- in a very real sense, this sub-network "won the lottery" during initialization. The authors find sub-networks via unstructured magnitude pruning with 85-95% of parameters removed that train to the same accuracy as the original network at a similar speed, which they call winning tickets. In this paper, we extend the Lottery Ticket Hypothesis to a variety of transfer learning tasks. We show that sparse sub-networks with approximately 90-95% of weights removed achieve (and often exceed) the accuracy of the original dense network in several realistic settings. We experimentally validate this by transferring the sparse representation found via pruning on CIFAR-10 to SmallNORB and FashionMNIST for object recognition tasks.

artificial intelligence, machine learning, ticket, (19 more...)

arXiv.org Machine Learning

1905.07785

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre:

Contests & Prizes (1.00)
Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Gambling (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
